home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Speccy ClassiX 1998
/
Speccy ClassiX 98.iso
/
amiga_system
/
the_aminet
/
dev
/
gcc
/
ixemulsdk.lha
/
man
/
cat3
/
db.0
< prev
next >
Wrap
Text File
|
1992-08-10
|
17KB
|
331 lines
btree_open, hash_open, recno_open - database access methods
##iinncclluuddee <<ssyyss//ttyyppeess..hh>>
##iinncclluuddee <<ddbb..hh>>
DDBB **
bbttrreeee__ooppeenn((ccoonnsstt cchhaarr **ffiillee,, iinntt ffllaaggss,, iinntt mmooddee,,
ccoonnsstt BBTTRREEEEIINNFFOO ** ooppeenniinnffoo));;
DDBB **
hhaasshh__ooppeenn((ccoonnsstt cchhaarr **ffiillee,, iinntt ffllaaggss,, iinntt mmooddee,,
ccoonnsstt HHAASSHHIINNFFOO ** ooppeenniinnffoo));;
DDBB **
rreeccnnoo__ooppeenn((ccoonnsstt cchhaarr **ffiillee,, iinntt ffllaaggss,, iinntt mmooddee,,
ccoonnsstt RREECCNNOOIINNFFOO ** ooppeenniinnffoo));;
and are access method interfaces to database files in btree,
hashed, and flat¡file formats, respectively. The btree format is
a representation of a sorted, balanced tree structure. The
hashed format is an extensible, dynamic hashing scheme. The
flat¡file format is a UNIX file with fixed or variable length
lines. These formats are described in more detail below. Access
to all file types is based on key/data pairs. Each routine opens
for reading and/or writing. Databases never intended to be pre¡
served on disk may be created by setting the file parameter to
NULL. The and are as specified to the routine, however, only the
O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_TRUNC and O_WRONLY flags are
meaningful. The argument is a pointer to an access method spe¡
cific structure described below. The open routines return a
pointer to a DB structure on success and NULL on error. The DB
structure contains at least the following fields:
typedef struct {
int (*close)(const DB *db);
int (*sync)(const DB *db);
int (*del)(const DB *db, const DBT *key, u_int flags);
int (*get)(const DB *db, DBT *key, DBT *data, u_int flags);
int (*put)(const DB *db, const DBT *key, const DBT *data,
u_int flags);
int (*seq)(const DB *db, DBT *key, DBT *data, u_int flags);
int type;
void *openinfo;
} DB;
The elements of this structure consist of a pointer to an access
method specific structure and a set of routines which perform
various functions. All of these routines take a pointer to a
structure as returned by one of the open routines, one or more
pointers to key/data structures, and, optionally, a flag value.
openinfo A pointer to an internal structure specific to the ac¡
cess method. type The type of the underlying access method; ei¡
ther DB_BTREE, DB_HASH or DB_RECNO. close A pointer to a routine
to flush any cached information to disk, free any allocated re¡
sources, and close the database file. Since key/data pairs may
be cached in memory, failing to close the file with a routine may
result in inconsistent or lost information. routines return ¡1
on error (setting and 0 on success. del A pointer to a routine
to remove key/data pairs from the database. routines return ¡1
on error (setting 0 on success, and 1 if the specified was not in
the file. get A pointer to a routine which is the interface for
keyed retrieval from the database. The address and length of the
data associated with the specified are returned in the structure
referenced by routines return ¡1 on error (setting 0 on success,
and 1 if the was not in the file. put A pointer to a routine to
store key/data pairs in the database. The parameter must be set
to one of the following values: R_IAFTER Append the data immedi¡
ately after the data referenced by creating a new key/data pair.
(This implies that the access method is able to create new keys,
i.e. the keys are ordered and independent, for example, record
numbers. Applicable only to the access method.) R_IBEFORE In¡
sert the data immediately before the data referenced by creating
a new key/data pair. (This implies that the access method is
able to create new keys, i.e. the keys are ordered and indepen¡
dent, for example, record numbers. Applicable only to the access
method.) R_NOOVERWRITE Enter the new key/data pair only if the
key does not previously exist. R_PUT Enter the new key/data pair
and replace any previously existing key. routines return ¡1 on
error (setting 0 on success, and 1 if the R_NOOVERWRITE was set
and the key already exists in the file. seq A pointer to a rou¡
tine which is the interface for sequential retrieval from the
database. The address and length of the key are returned in the
structure referenced by and the address and length of the data
are returned in the structure referenced by Sequential key/data
pair retrieval may begin at any time, and the position of the
``cursor'' is not affected by calls to the or routines. Modifi¡
cations to the database during a sequential scan will be reflect¡
ed in the scan, i.e. records inserted behind the cursor will not
be returned while records inserted in front of the cursor will be
returned. The flag value must be set to one of the following
values: R_CURSOR The data associated with the specified key is
returned. This differs from the routines in that it sets the
``cursor'' to the location of the key as well. (This implies
that the access method has a implicit order which does not
change. Applicable only to the and access methods.) R_FIRST The
first key/data pair of the database is returned. R_LAST The last
key/data pair of the database is returned. (This implies that
the access method has a implicit order which does not change.
Applicable only to the and access methods.) R_NEXT Retrieve the
key/data pair immediately after the key/data pair most recently
retrieved using the routine. The cursor is moved to the returned
key/data pair. If is set to R_NEXT the first time the routine is
called, the first key/data pair of the database is returned.
R_PREV Retrieve the key/data pair immediately before the key/data
pair most recently retrieved using the routine. The cursor is
moved to the returned key/data pair. If is set to R_PREV the
first time the routine is called, the last key/data pair of the
database is returned. (This implies that the access method has a
implicit order which does not change. Applicable only to the and
access methods.) routines return ¡1 on error (setting 0 on suc¡
cess, 1 if there are no more key/data pairs available. If the
access method is being used, and if the database file is a char¡
acter special file and no complete key/data pairs are currently
available, the routines return 2. sync A pointer to a routine to
flush any cached information to disk. If the database is in mem¡
ory only, the routine has no effect and will always succeed.
routines return ¡1 on error (setting and 0 on success. Access to
all file types is based on key/data pairs. Both keys and data
are represented by the following data structure: typedef struct {
void *data;
size_t size; } DBT; The elements of the DBT structure are defined
as follows: data A pointer to a byte string. size The length of
the byte string. Key/data strings must fit into available memo¡
ry. One of the access methods is a btree: a sorted, balanced
tree structure with associated key/data pairs. The access method
specific data structure provided to is as follows: typedef struct
{ u_long flags;
u_int psize;
u_int cachesize;
int (*compare)(const void *, const void *);
int lorder; } BTREEINFO; The elements of this structure are de¡
fined as follows: flags The flag value is specified by any of the
following values: R_DUP On insertion, if the key to be inserted
already exists, permit insertion anyway. This flag permits du¡
plicate keys in the tree. By default, duplicates are not permit¡
ted, and attempts to insert them will fail. Note, the order of
retrieval of key/data pairs with duplicate keys is undefined.
cachesize A suggested maximum size, in bytes, of the memory
cache. Setting this value to zero specifies that an appropriate
amount of memory should be used. Since every search examines the
root page of the tree, caching the most recently used pages sub¡
stantially improves access time. In addition, physical writes
are delayed as long as possible, so a moderate cache can reduce
the number of I/O operations significantly. Obviously, using a
cache increases the likelihood of corruption or lost data if the
system crashes while a tree is being modified. However, caching
10 pages decreases the creation time of a large tree by between
two and three orders of magnitude. compare Compare is a user de¡
fined comparison function. It must return an integer less than,
equal to, or greater than zero if the first argument is consid¡
ered to be respectively less than, equal to, or greater than the
second. The same comparison function must be used on a given
tree every time it is opened. If no comparison function is spec¡
ified, is used. lorder The byte order for 4¡byte integers in the
stored database metadata. The number should represent the order
as an integer; for example, big endian order would be the number
4,321. If is 0 (no order is specified) the current host order is
used. If the file already exists, the specified value is ig¡
nored and the value specified when the tree was created is used.
(Obviously, portability of the data forming the key/data pairs is
the concern of the application program.) psize Page size is the
size in bytes of the pages used for nodes in the tree. If the
file already exists, the specified value is ignored and the value
specified when the tree was created is used. If is zero, an ap¡
propriate page size is chosen (based on the system memory and/or
file system constraints), but will never be less than 512 bytes.
If the pointer to the data structure is NULL, the routine will
use appropriate values. If the database file already exists, and
the O_TRUNC flag is not specified to the parameter ignored. Key
structures may reference byte strings of slightly less than one¡
half the tree's page size only (see Data structures may reference
byte strings of essentially unlimited length. Searches, inser¡
tions, and deletions in a btree will all complete in O lg N.
Forward sequential scans of a tree are from the least key to the
greatest. Space freed up by deleting key/data pairs from a btree
is never reclaimed, although it is normally made available for
reuse. The exception to this is that space occupied by large da¡
ta items (those greater than one quarter the size of a page) is
neither reclaimed nor reused. This means that the btree storage
structure is grow¡only. The only solutions are to avoid exces¡
sive deletions, or to create a fresh tree periodically from a
scan of an existing one. One of the access methods is hashed ac¡
cess and storage. The access method specific data structure pro¡
vided to is as follows:
typedef struct { u_long (*hash)(const void *, const size_t);
u_int cachesize;
int bsize;
int ffactor;
int lorder;
int nelem; } HASHINFO; The elements of this structure are defined
as follows: bsize defines the hash table bucket size, and is, by
default, 256 bytes. It may be preferable to increase the page
size for disk¡resident tables and tables with large data items.
cachesize A suggested maximum size, in bytes, of the memory
cache. Setting this value to zero specifies that an appropriate
amount of memory should be used. ffactor indicates a desired
density within the hash table. It is an approximation of the
number of keys allowed to accumulate in any one bucket, determin¡
ing when the hash table grows or shrinks. The default value is
8. hash is a user defined hash function. Since no hash function
performs equally well on all possible data, the user may find
that the built¡in hash function does poorly on a particular data
set. User specified hash functions must take two arguments (a
pointer to a byte string and a length) and return an u_long to be
used as the hash value. lorder The byte order for 4¡byte inte¡
gers in the stored database metadata. The number should repre¡
sent the order as an integer; for example, big endian order would
be the number 4,321. If is 0 (no order is specified) the current
host order is used. If the file already exists, the specified
value is ignored and the value specified when the tree was creat¡
ed is used. (Obviously, portability of the data forming the
key/data pairs is the concern of the application program.) nelem
is an estimate of the final size of the hash table. If not set,
the default value is 1. If not set or set too low, hash tables
will expand gracefully as keys are entered, although a slight
performance degradation may be noticed. If the pointer to the
data structure is NULL, the routine will use appropriate values.
If the hash table already exists, and the O_TRUNC flag is not
specified to the parameters and are ignored. If a hash function
is specified, will attempt to determine if the hash function
specified is the same as the one with which the database was cre¡
ated, and will fail if it is not. Both key and data structures
may reference byte strings of essentially unlimited length.
Backward compatible interfaces to the routines described in and
are provided, however, these interfaces are not compatible with
previous file formats. One of the access methods is either vari¡
able or fixed¡length records, the former delimited by a specific
byte value. The access method specific data structure provided
to is as follows:
typedef struct { u_long flags;
u_int cachesize;
size_t reclen;
u_char bval; } RECNOINFO; The elements of this structure are de¡
fined as follows: flags The flag value is specified by any of the
following values: R_FIXEDLEN The records are fixed¡length, not
byte delimited. The structure element specifies the length of
the record, and the structure element is used as the pad charac¡
ter. R_SNAPSHOT This flag requires that a snapshot of the file
be taken when is called, instead of permitting any unmodified
records to be read from the original file. cachesize A suggested
maximum size, in bytes, of the memory cache. Setting this value
to zero specifies that an appropriate amount of memory should be
used. reclen The length of a fixed¡length record. bval The de¡
limiting byte to be used to mark the end of a record for vari¡
able¡length records, and the pad character for fixed¡length
records. Variable¡length and fixed¡length data files require
structures to reference the following structure:
typedef struct { u_long length;
u_long number;
u_long offset;
u_char valid; } RECNOKEY; The elements of this structure are de¡
fined as follows: length The length of the record. number The
record number. offset The offset in the file at which the record
is located. valid A flag value which indicates the validity of
the other fields in the structure. The flag value is specified
by one or more of the following values: R_LENGTH The record
length is valid. R_NUMBER The record number is valid. R_OFFSET
The byte offset is valid. If the record retrieval is successful,
the record number, byte offset and record length are set in the
RECNOKEY structure referenced by the caller's structure. Data
structures may reference byte strings of essentially unlimited
length. The routines may fail and set for any of the errors
specified for the library routines and or the following: [EFTYPE]
A file used by one of the routines is incorrectly formatted.
[EINVAL] A parameter has been specified (hash function, pad byte
etc.) that is incompatible with the current file specification or
there is a mismatch between the version number of file and the
software. The routines may fail and set for any of the errors
specified for the library routines or The and routines may fail
and set for any of the errors specified for the library routines
or The routines may fail and set for any of the errors specified
for the library routine Per¡Ake Larson, Communications of the
ACM, April 1988.
Margo Seltzer, USENIX Proceedings, Winter 1991. The typedef DBT
is a mnemonic for ``data base thang'', and was used because noone
could think of a reasonable name that wasn't already used. None
of the access methods provide any form of concurrent access,
locking, or transactions. Only big and little endian byte order
is supported.